
Scene Flow to Action Map: A New Representation for RGB-D based Action Recognition with Convolutional Neural Networks



Abstract

Scene flow describes the motion of 3D objects in the real world and could potentially be the basis of a good feature for 3D action recognition. However, its use for action recognition, especially in the context of convolutional neural networks (ConvNets), has not been previously studied. In this paper, we propose the extraction and use of scene flow for action recognition from RGB-D data. Previous works have considered the depth and RGB modalities as separate channels and extracted features for later fusion. We take a different approach and consider the modalities as one entity, thus allowing feature extraction for action recognition from the beginning. Two key questions about the use of scene flow for action recognition are addressed: how to organize the scene flow vectors, and how to represent the long-term dynamics of videos based on scene flow. In order to calculate the scene flow correctly on the available datasets, we propose an effective self-calibration method to align the RGB and depth data spatially without knowledge of the camera parameters. Based on the scene flow vectors, we propose a new representation, namely Scene Flow to Action Map (SFAM), that describes several long-term spatio-temporal dynamics for action recognition. We adopt a channel transform kernel to transform the scene flow vectors to an optimal color space analogous to RGB. This transformation takes better advantage of ConvNet models pre-trained on ImageNet. Experimental results indicate that this new representation can surpass the performance of state-of-the-art methods on two large public datasets.
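
To make the idea concrete, below is a minimal Python sketch of the general scheme the abstract describes: per-pixel scene-flow vectors (dx, dy, dz) from a sequence of frame pairs are pooled over time and color-coded into a single RGB-like action map that an ImageNet-pretrained ConvNet can consume. This is not the authors' implementation: the function name, the linear temporal weighting, and the identity matrix standing in for the paper's channel transform kernel are all illustrative assumptions.

import numpy as np

def scene_flow_to_action_map(flow_seq, transform=None):
    """flow_seq: array of shape (T, H, W, 3) holding per-pixel
    (dx, dy, dz) scene-flow vectors for T frame pairs.
    Returns an (H, W, 3) uint8 image: one RGB-like action map."""
    if transform is None:
        transform = np.eye(3)  # identity in place of the learned channel transform kernel
    # Simple ramp weighting so later frames contribute more: a stand-in
    # for the long-term temporal encoding described in the abstract.
    T = flow_seq.shape[0]
    weights = np.arange(1, T + 1, dtype=np.float64)
    weights /= weights.sum()
    pooled = np.tensordot(weights, flow_seq, axes=(0, 0))  # (H, W, 3)
    # Map the three flow axes to three color channels.
    mapped = pooled @ transform.T
    # Normalize each channel to [0, 255] so the result is a valid image.
    lo = mapped.min(axis=(0, 1), keepdims=True)
    hi = mapped.max(axis=(0, 1), keepdims=True)
    img = (mapped - lo) / np.maximum(hi - lo, 1e-8) * 255.0
    return img.astype(np.uint8)

# Usage: random flow stands in for real scene flow from RGB-D frames.
flow = np.random.randn(30, 120, 160, 3)
action_map = scene_flow_to_action_map(flow)
print(action_map.shape, action_map.dtype)  # (120, 160, 3) uint8

The identity transform above merely maps (dx, dy, dz) onto (R, G, B); as we read the abstract, the paper's channel transform kernel is instead chosen so that the resulting maps better match the color statistics of the natural images on which the ConvNets were pre-trained.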
